Speaker identification using cepstrum in Kannada language
Authors
Abstract
Speaker identification in forensic cases requires careful estimation of the features that are most specific to the speaker. Anatomical differences in the vocal tract give rise to speaker-related differences. The cepstrum has been investigated as a possible parameter for speaker identification. The fundamental frequency obtained from cepstral coefficients indirectly reflects the shape and size of the vocal tract. In the present study, quefrency and amplitude were extracted using cepstral analysis under various recording conditions. Thirty normal male speakers were selected. For the reading task, four paragraphs of a Kannada passage were selected, with the long vowels /a:/, /i:/ and /u:/ embedded in their words. To elicit spontaneous speech from the subjects, six Kannada words with long vowels in the medial position were used. The subjects' speech and paragraph readings were recorded in field conditions to match realistic forensic situations. Using CSL-4500 software, the cepstral parameters quefrency and amplitude were extracted for the long vowels /a:/, /i:/ and /u:/. The extracted parameters were normalized, and Euclidean distances between speakers were measured. The percent correct identification was above chance level for direct vs. direct and mobile vs. mobile recordings (DS vs. DS = 68%, MS vs. MS = 64%, DR vs. DR = 75%, and MR vs. MR = 62%). A four-way repeated-measures ANOVA revealed a significant difference between speakers on quefrency and a significant interaction among speaking style, recording, set and vowel on quefrency. Paired t-tests revealed a significant difference between direct and mobile recordings on quefrency for the long vowels /a:/, /i:/ and /u:/.
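The pipeline described in the abstract (cepstral peak extraction, normalization, Euclidean distance between speakers, percent correct identification) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the study used CSL-4500 software, and the abstract does not state the normalization or the decision rule. The function names `cepstral_peak` and `percent_correct`, the 70 to 300 Hz search range, z-score normalization, and the nearest-neighbour decision are all assumptions made for illustration.

```python
import numpy as np


def cepstral_peak(frame, sample_rate, f0_min=70.0, f0_max=300.0):
    """Return (quefrency in seconds, amplitude) of the largest cepstral peak.

    The search range f0_min..f0_max is an assumed pitch range for adult
    male voices; it is not taken from the study.
    """
    n = len(frame)
    windowed = frame * np.hamming(n)
    # Real cepstrum: inverse FFT of the log magnitude spectrum.
    log_magnitude = np.log(np.abs(np.fft.rfft(windowed)) + 1e-12)
    cepstrum = np.fft.irfft(log_magnitude, n=n)
    # Convert the pitch range to a quefrency range in samples.
    lo = int(sample_rate / f0_max)
    hi = min(int(sample_rate / f0_min), n // 2)
    peak = lo + int(np.argmax(cepstrum[lo:hi]))
    return peak / sample_rate, float(cepstrum[peak])


def percent_correct(reference, test):
    """Nearest-neighbour speaker identification on normalized features.

    reference, test: arrays of shape (n_speakers, n_features), where row i
    in both arrays belongs to the same speaker. Z-score normalization and
    the nearest-neighbour rule are assumptions for this sketch.
    """
    pooled = np.vstack([reference, test])
    mean = pooled.mean(axis=0)
    std = pooled.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    ref_z = (reference - mean) / std
    test_z = (test - mean) / std
    # Pairwise Euclidean distances: rows are test samples, columns references.
    distances = np.linalg.norm(test_z[:, None, :] - ref_z[None, :, :], axis=2)
    predicted = distances.argmin(axis=1)
    return 100.0 * float(np.mean(predicted == np.arange(len(test))))
```

In this sketch each speaker would contribute one feature row per condition (for example, quefrency and amplitude for each vowel), with one recording condition used as the reference set and the other as the test set.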
Similar resources
Native Language Identification Based on English Accent
Present work is aimed at investigating the influence of mother tongue (L1) of a South Indian speaker on a second language (L2). Second language can be a dominant local language, national language in India i.e., Hindi or a connecting language English. In the current study, L2 is a short discourse in English. Cepstral and prosodic features were used as in Language Identification (LID) to distingu...
Two Stage Neural Network model for Recognition of Indian Languages from Speech
India is a multilingual country. Officially about 20 languages are recognized by the government and there are about 500 languages spoken at different parts of the country. For developing the speech systems in Indian context, it is necessary to capture the language specific knowledge automatically from speech. Further it may be exploited for different speech tasks such as language identification...
Clipped LPC Cepstrum and Its Application to Text-Independent Speaker Identification
A new modification of the LPC cepstrum of the speech signal, called the clipped LPC (CLPC) cepstrum, is proposed. The CLPC cepstrum reduces the influence of the low-level regions of the LPC spectrum. Three LPC cepstra were evaluated as features in a text-independent speaker identification task using read text in Bulgarian collected over noisy telephone lines. These cepstra are: standard L...
A Comparative Analysis of Speaker Identification on English and Hindi Database
In this paper a text-dependent speaker recognition method is presented by combining Mel frequency cepstrum coefficients (MFCC) and Euclidean distance. The robustness of this speaker identification method for different speaking language is analyzed in this paper. The speaker identification algorithm using English and Hindi Indian voice database (IVD) which contains sentences of data spoken is ac...
Language and Text-Independent Speaker Identification System Using GMM
This paper motivates the use of the Dynamic Mel-Frequency Cepstral Coefficient (DMFCC) feature, and the combination of DMFCC and MFCC features, for robust language- and text-independent speaker identification. The MFCC feature, modeled on the human auditory system, has been the most widely used feature for speaker recognition because it is less vulnerable to noise perturbation and shows little session variability. Bu...
Publication date: 2012